ABSTRACT 

Results reported by Leinhardt, Zigmond, and Cooley 
(1981) have been interpreted as support for increased silent reading 
in classroom reading instruction. G. Leinhardt and colleagues 
examined a causal model of classroom processes influencing reading 
achievement and found that time spent in silent, rather than oral, 
reading was positively related to gains in reading achievement. To 
clarify the interpretation of these results, a study reanalyzed the 
Leinhardt data, which used students in 11 elementary classrooms for 
learning disabled students. Using linear structural equation 
modeling, the reanalysis showed that students' entry-level reading 
abilities had a significant direct effect on time spent in silent 
reading, but no such effect on time spent on oral or "indirect" 
reading. When entry-level abilities were more adequately controlled 
by incorporating measurement error into the model, silent reading no 
longer showed a significant effect on posttest reading performance. 
Indeed, under alternative models of these data, there was even the 
suggestion that time spent in oral reading had more effect on final 
reading achievement. These findings have important implications for 
the oral versus silent reading debate, as well as for the more 
general question of the relationship between time spent in reading 
and student achievement. (Author/FL) 

Abstract 



Results reported by Leinhardt, Zigmond, and Cooley (1981) have been interpreted as support for 
increased silent reading in classroom reading instruction. Leinhardt et al. examined a causal model of 
classroom processes influencing reading achievement and showed that time spent in silent reading, 
rather than oral reading, was positively related to gains in reading achievement. The present study 
reanalyzed the Leinhardt et al. data in an attempt to clarify the interpretation of their results. By 
means of linear structural equation modeling we show that students' entry-level reading abilities had 
a significant direct effect on time spent in silent reading, but no such effect on time spent on oral or 
"indirect" reading. Any attempt at examining the role of silent reading needs to take this into 
account. When entry-level abilities were more adequately controlled by incorporating measurement 
error into the model, silent reading no longer showed a significant effect on posttest reading 
performance. Indeed, under alternative models of the data, there is even the suggestion that time 
spent in oral reading had more effect on final reading achievement. These findings have important 
implications for the oral versus silent reading debate, as well as for the more general question of the 
relationship between time spent in reading and student achievement. 

Silent Reading Reconsidered: Reinterpreting Reading Instruction and its Effects 

I have steadily endeavoured to keep my mind free so as to give up 
any hypothesis, however much beloved ... as soon as facts are shown 
to be opposed to it ... for with the exception of the Coral Reefs, I 
cannot remember a single first-formed hypothesis which had not 
after a time to be given up or greatly modified. (Charles Darwin, 
1888, p. 83) 

Research on time spent in silent reading during classroom instruction and its effect on students' 
reading achievement is largely equivocal. In elementary school classrooms and classrooms for 
learning-disabled students, time spent in silent reading has been shown to be positively related to 
gains in reading achievement (Clark, 1975; Clark & Spath, 1979; Leinhardt, Zigmond, & Cooley, 
1981). In contrast, research with students in special education resource rooms has shown no 
relationship between silent reading and gains in reading achievement (Haynes & Jenkins, 1986), and 
investigation in secondary remedial reading classrooms has even shown the relationship to be 
negative (Stallings, 1980). In Stallings' study, gains in reading achievement were found to be 
positively associated with time spent in oral reading. 

While differences in population and associated instructional variables may provide a partial account 
of these findings, reviewers of this research tend to favor the results of Leinhardt et al. and interpret 
the general effects of silent reading on achievement to be positive. Leinhardt et al.'s study focused on 
classrooms for learning-disabled students and is important for at least two reasons. First, unlike 
some of the studies, the research is of high methodological quality. Leinhardt et al. present a causal 
model of classroom processes influencing reading achievement and test their model using students 
from a number of classes. There is a high regard for validity and reliability of measurement, a priori 
specification of the theoretical model, and relatively sophisticated statistical analysis. Second, their 
results indicate that silent reading may have a large effect on student performance and that relatively 
small increments in reading time may result in substantial gains in reading achievement. It is perhaps 
because of these reasons that reviewers frequently cite the Leinhardt et al. study when considering 
the oral versus silent reading debate (e.g., Allington, 1983, 1984; Hiebert, 
1983; Reutzel, 1985), as well as the more general question of the relationship between time spent in 
reading and student achievement (e.g., Anderson, Mason, & Shirey, 1984; Englert, 1984; Vallecorsa, 
Zigmond, & Henderson, 1985). 

The purpose of the present paper is to make the case that Leinhardt et al.'s data do not warrant the 
conclusion concerning the merits of silent reading. In interpreting their results, special consideration 
needs to be given to the effects of measurement error on parameter estimates in regression and to 
the variables used by Leinhardt et al. in specifying a model for explaining reading achievement. 

The Original Study 

Leinhardt et al. examined classroom processes and reading achievement in 11 elementary classrooms 
for learning-disabled (LD) students. Students in LD classes exhibit a wide range of abilities and 
instruction is usually individualized. The choice of the LD population afforded the opportunity to 
capitalize on this variation and so enhance the likelihood of obtaining stable parameter estimates. 

The sample consisted of 105 students ranging in age from 6 to 12 years with a mean age of 8.7. The 
students and teachers in each classroom were observed for an average of 30 hours over a 20-week 
period, and pre- and posttest measures of reading performance were obtained for all students. 

The variables under study were determined by the structural (causal) model for explaining reading 
achievement shown in Figure 1. There are two parts to this model, and each was examined 
separately. First, the model assumed that the posttest reading performance of a student was 
attributable to his or her pretest performance, the overlap between curriculum and test, and the 
reading behaviors in which the student engaged (as measured by times spent in silent, oral, and 
indirect reading). Indirect reading included those activities assumed to be related to reading but in 
which the student was not directly engaged in reading print (e.g., discussing a story, circling 
pictures with a common phonetic element, listening). Second, the model assumed that the 
students' reading behaviors (total time spent in the three activities) could be accounted for by the 
students' pretest performance, teacher instruction and affective contact, and the pacing of instruction. 
Affective contact included the amount of praise (reinforcers) received by each student and the 
academic focus (cognitive press) exerted by the teacher toward the student. It is important to note that 
in the first part of the model three separate reading variables were used, whereas in the second a 
combined reading variable was used (the aggregate of time spent in silent, oral, and indirect reading). 

[Insert Figure 1 about here.] 

Leinhardt et al. tested the two parts of their model by multiple regression with variables being 
entered simultaneously into each equation. The results are shown in Table 1. The first regression 
indicated that posttest performance was significantly influenced by pretest, overlap, and time spent in 
silent reading, but not by time spent in oral or indirect reading. The second regression indicated that 
total time in the three behaviors was significantly influenced by all independent variables except 
pacing. 

[Insert Table 1 about here.] 

These results provide overall support for the parts of the model, and point strongly to the beneficial 
effects of silent reading. According to Leinhardt et al., "these results suggest that an average of one 
minute per day of additional silent reading time increases posttest performance by one point. An 
increase of five minutes per day would be equivalent to about one month (on a grade-equivalent 
scale) of additional reading achievement" (p. 355). Given this finding, it is easy to understand how 
reviewers might interpret this research as providing support for increased silent reading during 
classroom reading instruction. 

Reanalysis 

The present reanalysis addresses a key problem with the interpretation of the Leinhardt et al. results. 
Because their analysis assumed error-free measurement, they were unable to control fully for the 
differential relations between pretest performance and the reading activities in which the students 
engaged. Unlike oral and indirect reading, time spent in silent reading was highly correlated with 
pretest performance (r = .63), suggesting that students' initial abilities may have had a direct effect on 
time allocated to silent reading; the better the entry-level ability of a student, the more likely the 
teacher might be to assign him/her to this type of activity (cf. Allington, 1983). The extent of this 
direct effect was not assessed in the original study because Leinhardt et al. did not test their model in 
its entirety. If measurements were error-free, the attempt at partialing out the influence of pretest 
performance by including this variable in the equation would have controlled for this confounding. 
However, to the extent that the pretest measure was fallible, it is unlikely that confounding was 
avoided (see Linn & Werts, 1982). Some portion of the posttest variance attributed to silent reading 
may have been due to the indirect effect of students' initial abilities on posttest performance. 
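
The direction of this bias can be made explicit in a deliberately simplified setting. The derivation below is a sketch only: it assumes just two predictors (pretest and silent reading time), standardized variables, and classical measurement error in the pretest alone, none of which is the full model analyzed in this paper. The symbols Y (posttest), S (silent reading time), P (true pretest score), P* (observed pretest), and rho (pretest reliability) are introduced here for illustration.

```latex
% Simplified two-predictor illustration (an assumption, not the full model).
% Y = posttest, S = silent reading time, P = true pretest score,
% P^* = observed pretest with reliability \rho; S and Y are treated as error-free.
\begin{align*}
\beta_S &= \frac{r_{YS} - r_{YP}\, r_{SP}}{1 - r_{SP}^{2}}
  && \text{(true pretest score controlled)} \\
r_{YP^*} &= \sqrt{\rho}\; r_{YP}, \qquad r_{SP^*} = \sqrt{\rho}\; r_{SP}
  && \text{(classical test theory)} \\
\beta_S^{*} &= \frac{r_{YS} - \rho\, r_{YP}\, r_{SP}}{1 - \rho\, r_{SP}^{2}}
  && \text{(fallible pretest controlled)} \\
\frac{\partial \beta_S^{*}}{\partial \rho}
  &= \frac{r_{SP}\,\bigl(r_{SP}\, r_{YS} - r_{YP}\bigr)}{\bigl(1 - \rho\, r_{SP}^{2}\bigr)^{2}}
\end{align*}
```

Under these assumptions the derivative is negative whenever silent reading is positively correlated with the pretest and the pretest has a positive partial effect on the posttest (r_YP > r_SP r_YS). The less reliable the pretest, the larger the coefficient credited to silent reading, which is the pattern of confounding described above.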

In the present reanalysis of the data we show that (a) students' entry-level reading abilities did indeed 
have a significant direct effect on amount of time spent in silent reading, (b) when measurement 
error is taken into account and students' initial abilities more adequately controlled, the effect of 
silent reading on posttest performance is non-significant, though still greater than that of oral 
reading, and (c) under alternative models of the data, the effect of silent reading is substantially 
reduced, and results may even suggest that time spent in oral reading had more effect on final 
reading achievement. 

Method 

Data Analysis 

Our analysis was undertaken using linear structural equation modeling implemented through 
LISREL VI (Jöreskog & Sörbom, 1984). The input was the correlation matrix for observed 
variables. All except four of the correlations were obtained from original source documents supplied 
by William W. Cooley of the Learning Research and Development Center, University of Pittsburgh, 
and these were accurate to five decimal places. The four correlations relating indirect reading to 
teacher instruction, cognitive press, reinforcers, and pacing could not be obtained from source 
documents and were taken from the published article. These were accurate to two decimal places. 
For all LISREL analyses, the diagonal elements of the factor variance-covariance matrix were free to 
be estimated and disturbances were assumed to be uncorrelated. All structural coefficients reported 
throughout this paper are those for standardized solutions. 

Models 

We conducted the analysis in three stages. First, the original model was tested in its entirety. This 
enabled examination of the separate effects of pretest performance on the three reading behaviors 
(not assessed by Leinhardt et al.) as well as comparison of the LISREL results with those of the 
original analysis. Second, estimates of the reliability of the measures were obtained and the full 
model incorporating measurement error was tested. Finally, the effects of silent, oral, and indirect 
reading were examined under alternative models of the data. 

Measures and Reliabilities 

Pretest. The pretest was a composite measure consisting of scores from six subtests of the Diagnostic 
Reading Scales (Spache, 1972) combined with the Level I Reading Subtest of the Wide Range 
Achievement Test (WRAT) (Jastak, Bijou, & Jastak, 1976). Because accurate estimation of its 
reliability was crucial to the analyses, several methods of estimation were employed. Lomax and 
Cooley (1980) report a coefficient alpha estimate of reliability for four components of the composite 
represented by the Spache subtests (using the same sample as Leinhardt et al.). When correction was 
made for the inclusion of the WRAT component, this estimate yielded a reliability of .91 (Spearman- 
Brown formula). 
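
The Spearman-Brown step can be sketched as follows. This is an illustration only: the starting value of .89 is a hypothetical alpha for the four-component composite (the text does not report it), and treating the WRAT as one additional parallel component, so that the composite is lengthened by a factor of 5/4, is likewise an assumption.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Projected reliability when a test is lengthened by `length_factor`.

    Assumes the added material is parallel to the existing components.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * reliability / (1.0 + (length_factor - 1.0) * reliability)


# Hypothetical input: alpha for the four-component composite (not given in the text).
alpha_four = 0.89
# Going from four components to five lengthens the composite by a factor of 5/4.
alpha_five = spearman_brown(alpha_four, 5 / 4)
print(f"projected reliability with five components: {alpha_five:.3f}")
```

The same function covers the familiar split-half case by passing a length factor of 2.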

The pretest measures were also administered as a posttest and so another estimate of reliability was 
provided by the correlation between scores on the composite at pretest and posttest. Tests 
administered on different occasions most likely constitute parallel or tau-equivalent forms. Although 
there were insufficient degrees of freedom to test the relative fit of the two models, LISREL analyses 
were used to examine the correlation between observed- and true-score components under the 
alternative models. The reliability estimates resulting from both models showed no departure from 
the observed correlation (.91) and, thus, agreed with the previous estimate. 

A further estimate of reliability was obtained from the intercorrelations among the seven subtests 
making up the composite reported in Lomax (1980). At pretest, the correlation matrix reported is 
for N = 120 and at posttest, for N = 101 (cf. the N = 105 in the Leinhardt et al. study). Treating these 
data as variance-covariance matrices of standardized scores and assuming five test components 
(based on a liberal interpretation of the construction of the composite as described in Lomax and 
Cooley, 1980), we computed coefficient alpha estimates of reliability. For the pretest, coefficient 
alpha is .93 and for the posttest, .91. The discrepancy appears due to sampling differences, rather 
than to any inherent measurement characteristic, and the average was taken as the estimate of 
reliability (.92). 
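
Computing coefficient alpha from a correlation matrix treated as a covariance matrix of standardized scores can be sketched as below. The 5 x 5 matrix is hypothetical (a uniform intercorrelation of .70 chosen purely for illustration); the subtest intercorrelations from Lomax (1980) are not reproduced here. Only the final averaging step uses figures reported in the text.

```python
def coefficient_alpha(cov):
    """Cronbach's alpha from a square variance-covariance matrix.

    alpha = k / (k - 1) * (1 - sum of component variances / variance of the total),
    where the variance of the total score is the sum of all matrix elements.
    """
    k = len(cov)
    if k < 2 or any(len(row) != k for row in cov):
        raise ValueError("cov must be a square matrix with at least two components")
    total_variance = sum(sum(row) for row in cov)
    component_variance = sum(cov[i][i] for i in range(k))
    return k / (k - 1) * (1.0 - component_variance / total_variance)


# Hypothetical correlation matrix for five test components (all r = .70).
r = 0.70
hypothetical = [[1.0 if i == j else r for j in range(5)] for i in range(5)]
print(f"alpha for the hypothetical matrix: {coefficient_alpha(hypothetical):.3f}")

# Averaging the two estimates reported in the text (.93 at pretest, .91 at posttest).
pooled = (0.93 + 0.91) / 2
print(f"average of reported estimates: {pooled:.2f}")
```

With a correlation matrix as input, every diagonal element is 1, so the result equals the standardized alpha based on the mean intercorrelation.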

Posttest. In addition to repeating the pretest, the reading subtests of the Comprehensive Test of 
Basic Skills (CTBS; CTB/McGraw-Hill, 1974a) were used as a posttest. Level B, C, or 1 was 
administered to each student based on age and expected grade-level in reading. In order to obtain an 
overall reliability estimate, the reliabilities (KR20) for total reading scores reported in Technical 
Bulletin No. 1 (CTB/McGraw-Hill, 1974b) were averaged across levels and grades within level. This 
yielded an estimate of .94. 

Student behaviors. Classroom observation was conducted using a time sampling procedure termed 
the Student-level Observation of Beginning Reading (SOBR). Details of this procedure are 
described in Leinhardt and Seewald (1981a). The system enables the categorization of student 
reading behaviors into direct (silent, oral) and indirect reading behaviors at the level of letters, words, 
sentences, and paragraphs. The measures of reading used in this and the original analysis were the 
number of minutes per day a student was reading silently, reading aloud, or engaged in indirect 
reading. 

A generalizability study of the SOBR (Lomax, 1982) revealed a high level of stability and 
interobserver agreement for the instrument (coefficients of .90). However, because most of the 
estimated variance components were for single-facet designs, no further information on the reliability 
of the measures could be gleaned from this study. Despite arguments against the use of observer 
agreement indices in estimating reliability (e.g., McGaw, Wardrop, & Bunda, 1972), the coefficient of 
.90 was used as the best available estimate of reliability for the measures of silent, oral, and indirect 
reading. Importantly, there is no suggestion in the generalizability study that the reliabilities for silent 
and oral reading were appreciably different. 

Teacher behaviors. Leinhardt et al. divided these into two areas: teacher instruction and teacher 
affective contact. Teacher instruction included model presentation, explanation, feedback, cueing, 
and monitoring, and was also recorded using the SOBR. Times spent in these activities were 
combined into a single estimate of the number of minutes per day a student received teacher 
instruction. Affective contact included reinforcers and cognitive press. Reinforcers were measured 
as the number of positive statements received by a student per day. Cognitive press was measured as 
a rating of the degree to which a student was focused on academic material and the degree to which 
the teacher supported that focus, recorded during each observational session in which a student was 
supposed to be engaged in academic activities other than reading. Again, in the absence of any better 
information, a reliability of .90 was assumed for each of the three measures of teacher behavior. 

Overlap. Overlap was an estimate of the relationship between the curriculum content covered and 
the posttest measure of reading performance. It was measured as the number of items on the CTBS 
for which the content had been taught, and was obtained through a teacher estimate for each 
student. Unfortunately, no information on the reliability of this measure was available. Leinhardt 
and Seewald (1981b) report a correlation of .71 between this measure and a computer-based estimate 
of curriculum-test overlap, using the same sample as Leinhardt et al., and this figure was taken as an 
approximation to the reliability. It is almost certainly an underestimate of the true reliability of the 
measure. 

Pacing. The pacing variable was an estimate of the rate of movement through the reading material 
and was measured by counting the number of words in texts and workbooks assigned to be read by a 
student over three consecutive days. The natural log of this variable was used in this and the original 
analysis. The measure was assumed to be error-free (reliability = 1.0). 

Original Model 

In the first stage of the analysis, Leinhardt et al.'s original model was tested in its entirety (i.e., the 
two parts combined) by specifying the three measures of student reading behaviors as separate 
endogenous variables. The model examines relationships among the observed variables and, hence, 
the analysis is essentially a path analysis. As in the Leinhardt et al. study, only the CTBS was used as 
the criterion measure of reading performance. The posttest composite measure was not included in 
their original analysis because the measure of curriculum-test overlap was not calibrated for this 
measure. 

Figure 2 presents the path diagram and results of the analysis. The goodness-of-fit statistics for the 
original model indicate a moderately good fit to the data (χ² = 22.84, df = 10, p = .01; RMSR = .025). 
As expected, students' pretest performance has a significant direct effect on amount of time spent in 
silent reading (t = 6.92, df = 99, p < .05), but no such effect on the amount of time spent in oral 
reading (t = -1.02, df = 99, p > .05) or indirect reading (t = .37, df = 99, p > .05). The coefficients for the 
structural equation for the posttest are identical to the standardized regression weights obtained in 
Leinhardt et al.'s first regression equation (and this despite the use of maximum-likelihood 
estimation rather than ordinary-least-squares). 

[Insert Figure 2 about here.] 

In order to demonstrate the extent of confounding associated with students' entry-level abilities, the 
model was retested with the path between pretest and posttest removed. Predictably, this more 
restrictive model showed a significantly poorer fit to the data (χ² difference = 55.15, df = 1, p < .05). 
More interesting is the change in the structural coefficients. The coefficient relating silent reading 
and posttest increases to .54, and the coefficient relating overlap and posttest increases to .34. There 
were no major changes in any other coefficients. Thus, when there is no control for entry-level 
reading abilities, silent reading as well as overlap absorb the variance in the posttest attributable to 
pretest performance. This finding reinforces our suspicion concerning the confounding of measures, 
and emphasizes the need to control fully for students' entry-level abilities. 

Model Assuming Measurement Error 

In the second stage of the analysis, the attempt was made to control more fully for entry-level ability 
by incorporating measurement error into the model. The model was tested with each observed 
variable serving as an indicator of a latent variable. Factor loadings and error variance components 
of observed variables were calculated from their reliabilities and the entire measurement model was 
fixed. For the pretest, the reliability calculated from the intercorrelations among the test components 
(.92) was taken as the best estimate. Again, only the CTBS was used as the posttest measure of 
reading performance. 
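
One common way to turn a reliability into fixed measurement parameters is sketched below. It assumes unit-variance observed variables (a correlation matrix as input), so that the loading is the square root of the reliability and the error variance is one minus the reliability; the exact parameterization used in the LISREL runs is not spelled out in the text, so this convention is an assumption. The reliabilities are those stated in the Measures and Reliabilities section; the dictionary keys are hypothetical labels.

```python
import math

# Reliabilities as stated in the text; the keys are hypothetical labels.
RELIABILITIES = {
    "pretest": 0.92,
    "ctbs_posttest": 0.94,
    "silent": 0.90,
    "oral": 0.90,
    "indirect": 0.90,
    "teacher_instruction": 0.90,
    "reinforcers": 0.90,
    "cognitive_press": 0.90,
    "overlap": 0.71,
    "pacing": 1.00,
}


def fixed_measurement_params(reliability: float) -> tuple[float, float]:
    """Return (factor loading, error variance) for a standardized indicator.

    Assumed convention: loading = sqrt(reliability), error variance = 1 - reliability,
    so that loading**2 + error variance equals the observed variance of 1.
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must lie in (0, 1]")
    return math.sqrt(reliability), 1.0 - reliability


params = {name: fixed_measurement_params(rel) for name, rel in RELIABILITIES.items()}
for name, (loading, error_variance) in params.items():
    print(f"{name:20s} loading = {loading:.3f}  error variance = {error_variance:.2f}")
```

Under this convention a measure assumed to be error-free, such as pacing, gets a loading of 1 and an error variance of 0, which reduces it to an observed variable.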

Figure 3 presents the path diagram and results of the analysis. Overall goodness-of-fit was 
reasonable, albeit modest (χ² = 30.49, df = 10, p = .001; RMSR = .023). Of special interest is the 
change in the coefficients for the posttest structural equation. The effect of pretest on posttest (t = 
7.41, df = 99, p < .05) increases, and the effect of silent reading on posttest is now smaller and 
non-significant (t = 1.24, df = 99, p > .05), although still greater than that of oral reading (t = 1.12, df = 99, p 
> .05) or indirect reading (t = -.15, df = 99, p > .05). The latter two coefficients remain almost 
unchanged. The effect of overlap on posttest also becomes smaller and non-significant (t = 1.35, df = 
99, p > .05), again indicating that this measure was confounded with pretest performance. 

[Insert Figure 3 about here.] 

These results are predicated on the accuracy of the estimate of pretest reliability. The estimate of .92 
was our best guess, based on the construction of the composite measure, although the actual range of 
reliability estimates was .91 to .93. In order to examine the effect of silent reading over this range and 
beyond, a series of LISREL analyses was conducted in Monte Carlo fashion by varying the factor 
loading and error variance for the pretest and holding all other parameters in the measurement 
model at the previously established values. Pretest reliabilities were varied from 1.0 through .80 in 
decrements of .01, and the beta coefficient relating silent reading to posttest performance was noted 
at each step. The curve describing the function is shown in Figure 4. Given the range of reliability 
estimates, the curve indicates that beta for silent reading falls between .10 and .12 and that even at 
the upper-end of the range the coefficient fails to reach significance (alpha = .05). Indeed, it is only at 
a value of .18 that beta becomes significant, and then the reliability required is .98! 
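
The logic of this sensitivity analysis can be sketched with a much smaller model than the one fitted here. The code below assumes only two predictors (pretest and silent reading) and corrects the pretest correlations for attenuation at each assumed reliability; it is not the LISREL procedure, it provides no significance test, and it will not reproduce the curve in Figure 4. The pretest-silent correlation of .63 is taken from the text; the two correlations with the posttest are hypothetical values chosen for illustration.

```python
# Observed correlations. Only the first is reported in the text.
R_PRETEST_SILENT = 0.63    # reported in the text
R_PRETEST_POSTTEST = 0.70  # hypothetical, for illustration only
R_SILENT_POSTTEST = 0.60   # hypothetical, for illustration only


def corrected_beta_silent(pretest_reliability: float) -> float:
    """Standardized coefficient for silent reading with the pretest disattenuated.

    Two-predictor simplification: correlations involving the pretest are divided
    by sqrt(reliability) before the usual partial-coefficient formula is applied.
    """
    rho = pretest_reliability
    if not R_PRETEST_SILENT ** 2 < rho <= 1.0:
        raise ValueError("reliability too low for these correlations")
    numerator = R_SILENT_POSTTEST - R_PRETEST_POSTTEST * R_PRETEST_SILENT / rho
    denominator = 1.0 - R_PRETEST_SILENT ** 2 / rho
    return numerator / denominator


# Reliabilities from 1.00 down to .80 in decrements of .01, as in the text.
reliabilities = [round(1.00 - 0.01 * step, 2) for step in range(21)]
betas = [corrected_beta_silent(rho) for rho in reliabilities]

for rho, beta in zip(reliabilities, betas):
    print(f"reliability = {rho:.2f}   beta(silent) = {beta:.3f}")
```

With these illustrative inputs the coefficient for silent reading shrinks steadily as the assumed pretest reliability falls, which is the qualitative behavior the analysis above describes.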

[Insert Figure 4 about here.] 

It could be argued that these values of beta may be unduly influenced by the error variance 
components in the other measures. If the reliabilities of these other measures have been incorrectly 
estimated, then our finding concerning the status of the beta coefficient for silent reading might be in 
error. An extreme test of this possibility was provided by assuming error-free measurement in all 
measures except the pretest, and again performing the series of Monte Carlo runs for changes in the 
reliability of the pretest (1.0 through .80). The curve describing the function is also given in Figure 4. 
For reliabilities above .86, the magnitude of beta is attenuated due to the effects of measurement 
error. Moreover, a reliability of .99 is now needed if the coefficient is to reach significance (alpha = 
.05). To the extent that there is measurement error in the other measures, the standard error of the 
estimate (beta) is increased, and our basic finding concerning the status of the silent reading 
coefficient remains the same. 



Alternative Models 



In the third stage of the analysis, we decided to examine the effect of removing the measure of 
curriculum-test overlap. This measure had been shown to be confounded with students' pretest 
performance in the previous analyses and, in any case, there was little substantive interest in the 
variable. It was included in the Leinhardt et al. model as a control for differences in content coverage 
of items on the CTBS. The decision to remove overlap was further prompted by doubts about its 
adequacy as an estimate of the content covered in relation to the items on the CTBS. Inspection of 
the correlations among observed variables revealed that overlap was correlated just as highly with the 
posttest composite (.51) as it was with the CTBS (.50), despite the fact that it had been calibrated 
only for items on the latter measure. 



The removal of overlap from the model assuming error-free measurement (our original model) 
yielded a χ² of 5.95 (df = 7, p = .55; RMSR = .017) and from the model assuming measurement error, 
a χ² of 9.05 (df = 7, p = .25; RMSR = .018). When these results are compared with those for 
the two corresponding models in which overlap was included, it is apparent that this modification 
resulted in a markedly superior fit to the data (χ² difference = 16.89, df = 3, p < .05 for models 
assuming error-free measurement; χ² difference = 21.44, df = 3, p < .05 for models assuming 
measurement error). Moreover, there were no major changes in the structural coefficients in either 
case. 
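
The difference tests reported above follow directly from the fit statistics of the paired models. The sketch below recomputes them with the Python standard library; the helper `chi2_sf` is written here for illustration (it uses the closed-form recurrence for integer degrees of freedom) and is not part of any package used in the original analysis.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with integer df.

    Uses Q(x; 1) = erfc(sqrt(x / 2)), Q(x; 2) = exp(-x / 2), and the recurrence
    Q(x; k + 2) = Q(x; k) + (x / 2)**(k / 2) * exp(-x / 2) / gamma(k / 2 + 1).
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1
    else:
        q, k = math.exp(-half), 2
    while k < df:
        q += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return q


def difference_test(chi2_restricted, df_restricted, chi2_other, df_other):
    """Chi-square difference, its df, and its p-value for two fitted models."""
    diff = chi2_restricted - chi2_other
    df_diff = df_restricted - df_other
    return diff, df_diff, chi2_sf(diff, df_diff)


# Fit statistics reported in the text: models with overlap versus without overlap.
error_free = difference_test(22.84, 10, 5.95, 7)
with_error = difference_test(30.49, 10, 9.05, 7)

for label, (diff, df_diff, p) in (("error-free", error_free),
                                  ("measurement error", with_error)):
    print(f"{label}: difference = {diff:.2f}, df = {df_diff}, p = {p:.5f}")
```

Both differences, 16.89 and 21.44 on 3 degrees of freedom, exceed the .05 critical value of 7.815, in line with the p < .05 reported above.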



With the overlap variable removed, it now made sense to examine the use of the alternative measure 
of posttest reading performance (the composite measure). The posttest composite correlated more 
highly with other observed variables than did the CTBS, and was thought to provide a better criterion 
measure. 



[Insert Figure 5 about here.] 

The use of the posttest composite was examined in a series of regression models because correlations 
with this composite were available only for variables in the posttest structural equation. These 
regression models correspond to Leinhardt et al.'s first regression equation (predicting the posttest) 
except that in the present analyses measurement error was incorporated using the previously 
established estimates of reliability. Three models were run (see Figure 5). The first model used only 
the CTBS as the criterion (reliability = .94), and was analyzed to provide an appropriate basis for 
comparison for the other two models. The second model used only the posttest composite as the 
criterion (reliability = .92). The third model used both the CTBS and the composite as multiple 
indicators of posttest performance. The latent variable so constructed was thought to provide the 
most valid and reliable criterion. In this analysis, the error variance of the posttest composite 
measure was fixed according to its reliability (.92), and the variance of the latent variable was scaled 
to that of the true-score component of the composite, leaving the factor loading and error variance of 
the CTBS free to be estimated. The results of all three analyses appear in Table 2. 

[Insert Table 2 about here.] 

At issue here is the magnitude of the parameter estimates rather than the goodness-of-fit of the 
models (the measurement parameters for the first two regression models are fixed and df = 0). The 
results from the first model show only minor changes from those for the full model. The results from 
the second and third models, on the other hand, show that the effect of silent reading is reduced. 
Indeed, under the most favorable conditions for valid and reliable measurement of the criterion 
(model 3), it is the coefficient for oral reading which approaches significance, perhaps suggesting that 
time spent in oral reading had more effect on final reading achievement. The pattern of results 
obtained in these three analyses showed no change even when overlap was retained in the models. 



Conclusion 

The conclusion to be drawn from this reanalysis seems inescapable. Contrary to Leinhardt et al.'s 
finding, there is no persuasive evidence that silent reading had an effect on students' reading 
achievement. As expected, students' entry-level abilities had a significant direct effect on time 
allocated to silent reading but no such effect on time allocated to oral or indirect reading. Any 
attempt at examining the role of silent reading, therefore, needs to take this into account. When 
measurement error was incorporated into the model and initial abilities more adequately controlled, 
silent reading no longer showed a significant effect on posttest performance. Under alternative 
models of the data, there is even the suggestion that oral reading may have had more effect on final 
reading achievement. 

The contrast between the results from Leinhardt et al. and our own analysis cannot be attributed to 
differences in the methods of estimation. In testing the original model, we were able to reproduce 
exactly the coefficients obtained by Leinhardt et al. in their first regression equation. Nor can the 
result easily be attributed to inaccurate estimates of the reliability of the measures. In testing the 
model with measurement error, the coefficient for silent reading failed to reach significance even 
when we allowed for some slippage in our estimate of pretest reliability and error-free measurement 
in the other measures. Lest this seem to be playing games with an arbitrary criterion for significance, 
the analysis of the alternative models showed that the coefficient is not only non-significant but also 
relatively small. Our regression models are comparable to Leinhardt et al.'s first regression equation 
and show the beta coefficient relating silent reading to posttest performance to be less than half the 
size of their original estimate. 

The implications of these results for the research literature are substantial. Unfortunately, the data 
provided by Leinhardt et al. do not warrant the conclusion concerning the merits of silent reading 
over oral reading as suggested in reviews by Allington (1983, 1984), Hiebert (1983), Reutzel (1985), 
and others. At best, such an interpretation gives sanction to a very fragile finding. At worst, it is 
probably wrong. More generally, the results of the reanalysis also call into question the 
interpretations of the data with respect to the relationship between time spent in reading and student 
achievement (e.g., Anderson, Mason, & Shirey, 1984; Englert, 1984; Vallecorsa, Zigmond, & 
Henderson, 1985). The finding in the present study concerning the effect of oral reading is only 
tentative, and no firm conclusion can be drawn from it. 

The present reanalysis is not without its limitations. The raw data from the original study were not 
available, and so we had to assume the distributions of observed variables were approximately normal. 
Judging by the mean and standard deviation, the distribution of silent reading time (mean = 13.68, SD 
= 8.82) may be slightly skewed (cf. oral reading time; mean = 13.40, SD = 7.52). Hence, our analysis 
may have slightly underestimated the relationship between silent reading and posttest performance. 
While this is possible, the extreme sensitivity of the silent reading coefficient to the effects of 
measurement error in the model (see Figure 4) suggests that skewness alone cannot account for the 
poor showing of the effect of silent reading. 
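One way to see why a mean and standard deviation alone can hint at skewness is sketched below. The reasoning is an assumption on our part about how such a judgment might be made: reading time cannot be negative, so if a normal curve with the reported mean and SD would place a noticeable share of cases below zero, the real distribution is probably bunched near zero with a longer right tail. The helper name `normal_share_below_zero` is hypothetical; the means and SDs are the ones reported in the text.

```python
import math


def normal_share_below_zero(mean, sd):
    """Share of a normal distribution with this mean and SD that falls below 0."""
    z = (0.0 - mean) / sd
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Means and SDs as reported for the two reading-time measures.
silent_share = normal_share_below_zero(13.68, 8.82)
oral_share = normal_share_below_zero(13.40, 7.52)

print(f"silent reading: {silent_share:.1%} of a normal curve would lie below zero")
print(f"oral reading:   {oral_share:.1%} of a normal curve would lie below zero")
```

The implied share of impossible negative values is larger for silent reading than for oral reading, which is consistent with the suggestion that silent reading time departs somewhat more from normality. The check is only a heuristic and cannot substitute for the raw data.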

Few people will be as disappointed in these results as we were. Given the promise of substantial gains 
to be made from increased silent reading in classroom reading instruction, it is indeed unfortunate 
that no convincing evidence for its positive effect can be found (see also Clark, 1975; Clark & Spath, 
1979; Haynes & Jenkins, 1986; Stallings, 1980). Other researchers concerned with the potential 
merits of silent over oral reading may find it hard to accept such an outcome. When facts and favor 
are at odds, however, empiricist traditions necessitate our reliance on the data. The words of Charles 
Darwin (1888) quoted at the beginning of this article provide a vivid reminder in this regard: "I have 
steadily endeavoured to keep my mind free so as to give up any hypothesis, however much beloved, 
as soon as facts are shown to be opposed to it ... for with the exception of the Coral Reefs, I cannot 
remember a single first-formed hypothesis which had not after a time to be given up or greatly 
modified" (p. 83). To be sure, there are very few "coral reefs" in educational research--and the 
superiority of silent over oral reading is probably not one of them. Now that doubt has been cast on the 
findings of one of the better studies, the need for good empirical work in the area is even more urgent 
than before. 
Table 1 

Results from Regression Analyses by Leinhardt et al. (1981) 

                         Regression Weights       Std Error 
                         Raw     Standardized     of Raw        F 

Equation 1: Predicting Posttest 

  Pretest                6.24        .66            .67        86.2 
  Overlap                 .40        .18            .12        10.2* 
  Silent reading         1.00        .15            .47         4.5 
  Oral reading            ^0         .06            .43         u 
  Indirect reading       -.09       -.02            .27          .1 

  Adjusted R^2 = .72 

Equation 2: Predicting Total Reading Behaviors 

  Pretest                 .70        .23            11         10.1 
  Teacher instruction    1.03        .44            .16        43.4* 
  Reinforcers             .04        .35            .01        26.4* 
  Cognitive Press        5.02        .22           1.63         9.4* 
  Pacing                 2.54        .10           1.98         1.6 

  Adjusted R^2 = .59 

* p < .05 

Table 2 

Results from Three Regression Models with Overlap Removed 
and Incorporating Measurement Error (Standardized Solution) 

                         Structural       Std 
                         Coefficient      Error        t 

Model 1: Predicting CTBS 

  Pretest                   .81            .09        9.04 
  Silent reading            .10            .09        1.05 
  Oral reading              .09            .06        1.38 
  Indirect reading         -.01            .06        -.22 

Model 2: Predicting Composite 

  Pretest                   .94            .07       13.81 
  Silent reading            .05            .07         .77 
  Oral reading              .09            .05        1.83 
  Indirect reading          .05            .05        1.10 

Model 3: Predicting CTBS & Composite Latent Variable 

  Pretest                   .92            .07       13.26 
  Silent reading            .07            .07         .98 
  Oral reading              .09            .05        1.87 
  Indirect reading          .03            .04         .70 

p < .05 

Figure Captions 

FIGURE 1. Causal model for explaining reading achievement analyzed by Leinhardt, Zigmond, 
and Cooley (1981). 

FIGURE 2. Standardized solution for original model (* p < .05; numbers in parentheses are 
residual variances). 

FIGURE 3. Standardized solution for model assuming measurement error (* p < .05; numbers in 
parentheses are residual variances). 

FIGURE 4. Beta coefficient for silent reading as a function of reliability of pretest. 

FIGURE 5. Three regression models with overlap removed and incorporating measurement 
error. 